Workbench Add-On

home *** CD-ROM | disk | FTP | other *** search

/ Workbench Add-On / Workbench Add-On - Volume 1.iso / Dev / Oberon / texts / SymbolFiles.Text < prev next >

Wrap

Text File | 1994-02-09 | 54.1 KB | 1,367 lines

Departement Informatik Institut für Computersysteme Eidgenössische Technische Hochschule Zürich Robert Griesemer On the Linearization of Graphs and Writing Symbol Files Cuno Pfister (ed.) Oberon Technical Notes Beat Heeb Josef Templ March 1991 Address of the authors: Computersysteme ETH-Zentrum CH-8092 Zurich, Switzerland e-mail: griesemer@inf.ethz.ch pfister@inf.ethz.ch Copyright (c) 1991 Departement Informatik, ETH Zürich On the Linearization of Graphs and Writing Symbol Files Robert Griesemer Abstract The linearization of general graphs represented by pointer and record data structures is a problem often arising in computer programs. Whenever a graph has to be stored on or transmitted over a sequentially organized carrier, a form of linearization is used. A simple algorithm for this purpose is presented and a special application - the writing of symbol files as required by modern language compilers - is described in more detail. Keywords: Linearization, Graphs, Symbol Files. 1. Linearization of Graphs General graphs occur in various forms in computer science, sophisticated data structures may often be interpreted as graphs. Whenever such a data structure has to be stored on a file or has to be transmitted over a network, the linearization problem occurs. For special forms of non-linear data structures (e.g. trees) well-known solutions exist and are comprehensively described in the basic literature. Nevertheless, for more general data structures the wheel has to be reinvented and literature is difficult to find [ReMö86]. The problem has a twofold nature: firstly, the graph has to be linearized by means of a write algorithm , and secondly, it has to be rebuilt out of its linear description by a read algorithm . In the following, linearization means writing a graph to a file. 1.1 Preconditions First of all, we remember that every general (undirected) graph may be represented by a directed graph, by means of having two directed edges instead of an undirected one. Hence, we concentrate on directed graphs only. Secondly, we consider only rooted and connected graphs; i.e. graphs with a special root node and a path from the root to every other node. Unconnected graphs or graphs with several root nodes are easily extended such, that they obey the above preconditions. Thirdly, as an (academic) restriction only finite graphs are considered. In computer programs, graphs may be represented in various forms, depending on the available notational support (the programming language) and the way the data structure is used (the program). Here we consider only graphs described in terms of pointers and records; i.e. every graph node is represented by a record and every edge is represented by a pointer. Note that possible information attached to the edges of a graph ( labeled edges ) may be interpreted as additional node data. The nodes of such a graph may not all have the same type; i.e. the amount of data in every node may be different and the number of directed edges to other nodes may vary. It depends on the application and available language how different nodes are specified. For the sake of simplicity we identify each node with a positive tag -number (> 0), so that every node type is clearly determined by its tag and vice versa. Then, every node contains a certain amount M(tag) of data and (at most) a certain number N(tag) of directed edges to other nodes. Hence, a general graph node and its edges may be described in the following way (written in Oberon [Wi88]): TYPE Node = POINTER TO NodeDesc; NodeDesc = RECORD tag: INTEGER; (* determines node type; tag > 0 *) data: ARRAY M(tag) OF INTEGER; (* M depends on the node type (tag) *) link: ARRAY N(tag) OF Node (* N depends on the node type (tag) *) END; Note that any data structure represented in any way in computer memory may be written to a file by simply writing out sequentially the corresponding memory area. Reading requires "only" the data to be read into the same memory area (otherwise pointers would be incorrect). But normally this is an inappropriate solution for several reasons: often the data structure is distributed over the entire available memory and writing would lead to an immense waste of space. For reading, the destination memory addresses must be specified which is almost never possible and a file created this way is inherently not portable (to another computer system). Hence, writing of graphs requires pointers to be transformed into another form of reference and vice versa for reading. In addition, reading and writing should be as efficient as possible considering both memory and time; e.g. each node description should appear once and only once on the file. 1.2 The Write Algorithm The write algorithm resembles closely a naive recursive mark-algorithm for garbage collectors, actually it is nearly the same task to be done: all nodes have to be traversed and marked; marking is necessary to avoid a second traversal and to refer to the node's first occurence. While traversing the graph its structure is written to a file. Hence, the node data structure has to be extended by a mark field, which initially must be zero: TYPE Node = POINTER TO NodeDesc; NodeDesc = RECORD mark: INTEGER; (* used for linearization; initially = 0 *) tag: INTEGER; (* determines node type; tag > 0 *) data: ARRAY M(tag) OF INTEGER; (* N depends on the node type (tag) *) link: ARRAY N(tag) OF Node (* M depends on the node type (tag) *) END; We say a node q is marked iff q.mark > 0. The WriteNode procedure (see below) traverses the graph in pre-order, starting with the root node. Whenever an unmarked node is encountered, the node is numbered and thus marked (line 4), the node tag and data are written (lines 4 and 5) and its successor nodes are traversed (line 6). Note that a node must be numbered before its subtrees are traversed because they may refer to it. Numbering is done using the counter NofNodes which is one initially and is then incremented by one with every processed node. Hence, NofNodes is always greater than zero. Whenever an already marked node is traversed, only its negative node number is written out as reference number (line 2). If the node doesn't exist (i.e. q = NIL) zero is written out (line 1): VAR NofNodes: INTEGER; (* node number; initially = 1 *) PROCEDURE WriteNode(q: Node); VAR i: INTEGER; BEGIN (1) IF q = NIL THEN WriteInt(0) (2) ELSIF q.mark > 0 THEN (* already marked *) WriteInt(-q.mark) (3) ELSE (* first occurence *) (4) WriteInt(q.tag); q.mark := NofNodes; INC(NofNodes); (5) i := 0; WHILE i < M(tag) DO WriteInt(q.data[i]); INC(i) END; (6) i := 0; WHILE i < N(tag) DO WriteNode(q.link[i]); INC(i) END (7) END END WriteNode; During execution of WriteNode with argument root all nodes are numbered in the order of their first occurence. References to already written nodes are the negative node numbers. While reading the file this information will be used to reconstruct the graph (see 1.3 The Read Algorithm). Before WriteNode can be called a second time for the same graph or parts of it, the preconditions have to be established again, i.e. the nodes must be unmarked. When using time-stamps, no such unmark pass is necessary (due to an idea of J. Templ, see [PfiHeTe91]: A Symmetric Solution to the Load/Store Problem ). The WriteNode procedure leads to the following file structure (in EBNF, terminal symbols are described in double quotes): Graph = NodeRef | NoNode | NodeDesc. NodeRef = "negative node number (< 0)". NoNode = "0". NodeDesc = "node tag (> 0)" "node data" {Graph}. It is obvious that the algorithm terminates for all graphs: there is only a finite number of nodes and every procedure call works on at least one node. The recursion is stopped when an already marked node and hence a loop is found (line 2) and the algorithm returns to its predecessor node. In line 4 the assignment q.mark := NofNodes is executed only if q.mark = 0, i.e. each node is numbered at most once (because NofNodes > 0). On the other hand, since after numbering of the node q^ (line 4) WriteNode numbers all subgraphs of q^ (line 6), each node is numbered at least once. In combination we may conclude that each node is marked exactly once. Finally, because each node's data is written iff the node was numbered immediatly before (line 5), it becomes clear that each node is written only once. Note that WriteNode is a one-pass algorithm and that no additional data structures except NofArgs and the procedure activation stack are required. The run time complexity is O(n) where n is the number of edges in the graph. As we mentioned above, WriteNode resembles closely the mark phase of a mark-and-sweap garbage collector, therefore it is possible to transform the algorithm into a completely iterative form (e.g. if stack space is critical). The analoguous transformations for a simple garbage collector are described in [PfiHeTe91]. 1.3 The Read Algorithm The read algorithm reconstructs a (sub-) tree from its description on a file. Each node block in the file starts with an identification tag. According to the EBNF file structure there are three cases to distinguish: the node tag may be zero, negative or positive (excluding zero). VAR NofNodes: INTEGER; (* node number; initially = 1 *) NodeTab: ARRAY MaxNodes OF Node; (* node table; NodeTab[0] = NIL *) PROCEDURE ReadNode(VAR q: Node); VAR tag, i: INTEGER; BEGIN (1) ReadInt(tag); (2) IF tag <= 0 THEN (* node reference *) q := NodeTab[-tag] (3) ELSE (* first occurence *) (4) NEW(q, tag); NodeTab[NofNodes] := q; INC(NofNodes); (5) i := 0; WHILE i < M(tag) DO ReadInt(q.data[i]); INC(i) END; (6) i := 0; WHILE i < N(tag) DO ReadNode(q.link[i]); INC(i) END (7) END END ReadNode; The first and second case are handled together by initializing the (otherwise unused) NodeTab entry zero to NIL. Because an empty subtree is identified by a zero tag, q then automatically becomes NIL (line 2). A node that occurs for the first time is identified by a positive number which is also its node tag. Then, such a node is created using NEW and stored in a global node table NodeTab (line 4). Note again that this assignment must be done before any subtrees are read (they may potentially refer to it). After creation of the new node its data and its subtrees are read (line 5 and 6). A negative node tag refers to an already read node, with the node number -tag. This node is obtained from the NodeTab array (line 2). It should be clear that this algorithm rebuilds the original graph, as each line in ReadNode has its counterpart in WriteNode and each call of WriteNode during writing implies a corresponding call of ReadNode during reading. Clearly this algorithm is also O(n) where n is the number of edges in the graph. A temporary node array of size n is used for reading, this might be a drawback in case of very large graphs. A 2-pass algorithm which needs temporary storage only for nodes which are referenced more than once is described in the appendix. 2. Writing Symbol Files Today's modular programming languages like Modula-2, Ada, Oberon and others allow an application to be programmed in several more or less independent parts, so-called modules (or packages). Such a module typically defines data structures and provides operations (procedures) on them, these exported objects may be used ( imported ) by other modules without knowledge of their implementation. Therefore, the information about a module's interface has to be available to the compiler in an efficient way: in Modula-2 and Oberon the necessary data is recorded on so-called symbol files. For historical reasons and to avoid confusion we use the term symbol file for any linear data structure which describes a module's interface. Actually, it is not necessary that symbol files are represented by a separate file (see 2.4 Implementation Aspects). During compilation, the information about a module's interface is held in the symbol table of the compiler. The symbol table can be regarded as a graph, hence writing a symbol file requires linearizing the necessary partial graph of this table. The methods described in the following have been implemented in a compiler for an experimental Oberon-like language [Gri90]. The code fragments detail only the principal structure, a complete implementation may be found in the appendix. 2.1 Structure of Symbol Tables The symbol table of a compiler for an Oberon-like language describes the compiled objects, which are constants, types, variables, procedures, or modules. The table itself is built when processing declarations. These may be nested, so the compiler has to manage a stack of scopes, i.e. visibility ranges of objects. For our purposes, it is sufficient to store the objects of a scope in a linear list. More sophisticated implementations would use structures like binary trees, because objects have to be searched efficiently during compilation. In Modula-2 and Oberon only global objects may be exported, i.e. only objects within the global scope of the symbol table have to be considered for export (Fig. 1). Hence, writing the symbol file means linearizing the partial graph described by the exported (and therefore in some way marked) objects of this scope. Fig. 1 Global Scope Each node represents an object of the compiled program, containing all the necessary information about it, such as its name, the object kind, possibly an address and its type. Object types itself may have a very complex structure and are further described using a Struct data structure, which again may refer to other Struct nodes [Wi85, Cre90]. Hence objects and types are described with two different record structures: CONST (* object modes *) Undef* = 0; Scope = 1; Const* = 2; Type* = 3; Var* = 4; VarPar* = 5; XVar* = 6; IVar* = 7; Field* = 8; LProc* = 9; XProc* = 10; IProc* = 11; SProc* = 12; Mod* = 13; (* type forms *) Char* = 2; Bool* = 3; SInt* = 4; Int* = 5; LInt* = 6; Set* = 7; Real* = 8; LReal* = 9; Cmplx* = 10; LCmplx* = 11; NilTyp* = 13; NoTyp* = 14; Proc* = 15; String* = 16; Array* = 17; DynArr* = 18; Record* = 19; Pointer* = 20; TYPE Name* = ARRAY 32 OF CHAR; Object* = POINTER TO ObjectDesc; Struct* = POINTER TO StructDesc; ObjectDesc* = RECORD link*, next*: Object; (* objects are chained using next *) name*: Name; typ*: Struct; marked*: BOOLEAN; (* marked objects are exported *) mode*: INTEGER; (* identifies object kind *) mnolev*: INTEGER; (* module numbers are <= 0 *) ... (* object specific data *) END; StructDesc* = RECORD ref: INTEGER; (* used as mark field *) form*: INTEGER; (* identifies structure kind *) obj*: Object; (* points to type object if it exists *) len*, size*: LONGINT; base*: Struct; (* result-, element-, base- or pointee type *) link*: Object (* record scope *) END; Struct nodes describe the type of an object (array, record, pointer, procedure); a few types are predefined (e.g. Char, Integer, Real). Because types may be recursively defined, the resulting data structure may contain cycles. Many objects may have the same type and therefore refer to the same struct node. 2.2 Export The graph linearization algorithm in the form described in section 1 actually works for different node kinds (determined by the tag field) but requires that all nodes are described using the same record. However, enforcing this precondition would have a far too strong impact on the structure of a compiler. As we have actually two different record types for object and type description, the adequate solution is to have two sets of read and write procedures, one set for objects and one for types. This distinction is even more justified by the fact that objects are stored in a simple linear list and are referenced only once (with the exception of type and module objects) while type descriptors may be referenced several times and potentially belong to cycles. References from struct nodes to their type objects are handled directly in the WriteStruct procedure. Modules are never marked for export and written using a special procedure. As a consequence, the write procedure which traverses the object scope does not have to take care of objects already traversed and therefore no marking is necessary. Hence, the write procedure degenerates to a simple list traversal while for the corresponding read procedure no temporary array is necessary. Using the above definitions, the WriteObjects procedure can be written as follows (primitive operations like WriteInt and WriteName may be found in the appendix section): PROCEDURE WriteObjects(obj: Object); BEGIN WHILE obj # NIL DO IF obj.marked THEN WriteInt(obj.mode); IF (obj.mode # Type) OR (obj.typ.obj # obj) THEN (* no-type or alias type *) WriteName(obj.name) ELSE (* other type *) Write(0X) END; WriteStruct(obj.typ); IF obj.mode = Const THEN (* write const value *) ELSIF obj.mode = LProc THEN (* write parameter list *) ... END END; obj := obj.next END; WriteInt(Undef) (* termination tag *) END WriteObjects; For exported objects, the mode which describes the object kind, the object's name and its type are written. As an optimization, the name of types is only written if it concerns alias types. Otherwise, the name will be written by the WriteStruct procedure (given below). Then, depending on the object mode, additional information is written (e.g. the value of constants or the parameter list of procedures). The WriteStruct procedure closely resembles the Write algorithm described in 1.2 but is slightly complicated by the fact that named types (i.e. types for which a type object exists) have to be handled correctly. Remember, that it is not correct to simply call the WriteObjects procedure, because WriteObjects is not built to handle more than one reference to an object. In addition, named types (and only these!) may be exported and imported over many modules and it must always be guaranteed that a type, whether it was imported via several modules ( indirect import ) or not, is described by one and only one struct node. A unique identification is necessary, which is the type name combined with the description of the module where the type was defined first. An analogous situation occurs when a type gets an alias name: then, the struct node always points to the first type object which defined the type. VAR nofStructs: INTEGER; (* structure number; initially = 1 *) PROCEDURE WriteStruct(typ: Struct); VAR name: Name; BEGIN IF typ = NIL THEN WriteInt(0) ELSIF typ.ref > 0 THEN (* already marked *) WriteInt(-typ.ref) ELSE (* first occurence *) WriteInt(typ.form); typ.ref := nofStructs; INC(nofStructs); IF typ.obj # NIL THEN (* named type *) name := typ.obj.name; IF ~typ.obj.marked THEN (* invisible type *) name[0] := CHR(ORD(name[0]) - ORD("@")) END; WriteName(name); WriteMod(GMod[-typ.obj.mnolev]) ELSE Write(0X) END; CASE typ.form OF | Proc: (* write parameter list and result type *) | Array: (* write element type and length *) ... END END END WriteStruct; Note that for exported named types there is a distinction between those with and those without corresponding exported type objects, where the latter are called invisible types. Invisible types must not be visible in an importing module, i.e. a programmer is not allowed to use them by name within a declaration. On the other hand they have to be visible to the compiler because it must be ensured that all invisible types with the same name are mapped onto a common struct node. Hence, the same information as for named types is written, but a simple trick inhibits any use of the type name within a program: the first letter of its name (which is always greater or equal than "A") is modified such that the name becomes a syntactically invalid identifier. For named types the declaring module has to be known. Because several types may be exported by the same module, the corresponding module object may be referenced more than once. A special procedure which handles modules correctly is used: VAR nofLMods: INTEGER; (* local module number; initially = 0 *) PROCEDURE WriteMod(mod: Object); BEGIN IF mod.ref < 0 THEN (* first occurence *) mod.ref := nofLMods; INC(nofLMods); WriteInt(Mod); WriteKey(mod.key); WriteName(mod.name) ELSE WriteInt(-mod.ref) END END WriteMod; Like struct nodes, module descriptors are only written at their very first occurence. Whenever the same module is referenced later, only its reference number is written. Module pointers are never NIL, so numbering starts with zero and a module is marked iff mod.ref >= 0. During import the compiler has to check that only one version of a module is used globally, therefore every module needs a unique key. Remember that modules are never written out accidentally by the WriteObjects procedure, because they are never marked for export. Writing the complete symbol file requires writing out the module descriptor of the compiled module (by means of WriteMod) followed by writing out all its exported objects (by means of WriteObjects). 2.3 Import Like the general ReadNode procedure is mirroring the structure of the WriteNode procedure, the import procedures are similar to the export procedures. This fact allows for easy extension, because additionally written data in an export procedure automatically leads to the corresponding read operations in the import procedure. Nevertheless, a few difficulties need to be mastered anyway. In several situations an object may be imported more than once: the trivial case occurs when the same module is imported twice because of a substitution (or alias) name. One might expect that this special case should be handled by simply ignoring a second import; but actually a multiple import of objects has to be handled anyway, hence double import of entire modules is only a special case of a more general situation. Multiple import of an object normally occurs for type objects, when beeing imported indirectly across different modules. Consider the following situation: a module A exports a type (object) T which is used in a variable V of a module B. Module C which imports A and B consequentially also imports the type T from A and the variable V from B and hence once again the type T as type of V (Fig. 2). Fig. 2 Double import However, multiply loading objects or structures, if not detected, could lead to incorrect incompatibilities during type checking. Therefore, each object which is read from the symbol file is discarded if it was already present in the symbol table. Hence, each object is represented by its very first loaded instance which is called primary instance [Gu85]. In consequence, the ReadObjects procedure reads an object from the symbol file but the procedure InsertImport inserts it in the corresponding module scope only if the object (with the same name) has not already been imported. As an optimization, note that only alias types have to be (additionally) inserted with InsertImport because their names differ from their type object names. All other types are handled in the ReadStruct procedure. PROCEDURE InsertImport(VAR obj: Object; scope: Object); VAR p, q: Object; BEGIN p := scope; q := scope.next; WHILE q # NIL DO IF q.name = obj.name THEN obj := q; RETURN END; p := q; q := q.next END; obj.mnolev := scope.mnolev; p.next := obj END InsertImport; PROCEDURE ReadObjects(scope: Object); VAR mode: LONGINT; obj: Object; typ: Struct; name: Name; BEGIN LOOP (* read all objects *) ReadInt(mode); IF mode = Undef THEN (* no more objects *) EXIT ELSIF mode = Type THEN ReadName(name); IF name # "" THEN (* alias type *) NewObject(obj, name, Type); ReadStruct(obj.typ); InsertImport(obj, scope) ELSE (* other types *) ReadStruct(typ) END ELSE ReadName(name); NewObject(obj, name, SHORT(mode)); ReadStruct(obj.typ); IF mode = Const THEN (* read const value *) ELSIF obj.mode = LProc THEN (* read parameter list *) ... END; InsertImport(obj, scope) END END END ReadObjects; The procedure ReadStruct decides upon reading the first number form whether the structure was already read from the (same) symbol file or not. In the first case, the structure already built is taken out of a local structure table LStruct which serves as a translation table for structure references analogous to NodeTab in the ReadNode procedure. In the second case, the structure is created and inserted in the LStruct table, then the specific structure information is read. As described in ReadObjects, named types must also be ignored if occuring a second time: the structure information is read but then discarded (by means of InsertImport) and the primary instance is used (and also inserted in the LStruct table). As a tricky point, notice further: Because each type (named or not) may refer to itself, it is absolutely necessary that the primary instance is found before any other types are read (which potentially refer to the same type and therefore would obtain the wrong structure out of the LStruct table). Hence, no other types must occur between the type tag of a named type and its identification, i.e. its name and its original module description. This rule corresponds to postulate 5 in [Gu85]. VAR nofStructs: INTEGER; (* structure number; initially = 1 *) LStruct: ARRAY MaxNofStructs OF Struct; (* local structure table; LStruct[0] = NIL *) PROCEDURE ReadStruct(VAR typ: Struct); VAR form: LONGINT; htyp: Struct; name: Name; obj, mod: Object; BEGIN ReadInt(form); IF form <= 0 THEN (* struct reference or NIL *) typ := LStruct[-form] ELSE (* first occurence *) NewStruct(htyp, SHORT(form)); ReadName(name); IF name # "" THEN (* named type *) NewObject(obj, name, Type); obj.marked := TRUE; obj.typ := htyp; htyp.obj := obj; ReadMod(mod); InsertImport(obj, mod.link); typ := obj.typ ELSE typ := htyp END; LStruct[nofStructs] := typ; INC(nofStructs); CASE form OF | Proc: (* read parameter list and result type *) | Array: (* read element type and length *) ... END END END ReadStruct; At the end, the ReadMod procedure is presented. As expected, a local module table LMod is used as translation table for module references. In addition, because modules must always be represented by their primary instances, a global module table GMod is necessary, which is accessed in the InsertMod procedure. VAR nofGMods*: INTEGER; (* global module number; initially = 0 *) GMod*: ARRAY MaxNofGMods OF Object; (* global module table *) PROCEDURE InsertMod* (VAR mod: Object; VAR name: ARRAY OF CHAR; key: LONGINT); VAR i: INTEGER; BEGIN i := 0; WHILE (i < nofGMods) & (name # GMod[i].name) DO INC(i) END; IF i < nofGMods THEN (* module already imported *) mod := GMod[i]; IF mod.key # key THEN err(150) (* key inconsistency *) END ELSE NewObject(mod, name, Mod); (* must not be visible in global scope *) mod.key := key; mod.mnolev := -nofGMods; OpenScope(mod.link, mod.mnolev); CloseScope; (* allocate own scope *) GMod[nofGMods] := mod; INC(nofGMods) END END InsertMod; VAR nofLMods: INTEGER; (* local module number; initially = 0 *) LMod: ARRAY MaxNofLMods OF Object; (* local module table *) PROCEDURE ReadMod(VAR mod: Object); VAR ref, key: LONGINT; name: Name; BEGIN ReadInt(ref); IF ref > 0 THEN (* first occurence *) ReadKey(key); ReadName(name); InsertMod(mod, name, key); LMod[nofLMods] := mod; INC(nofLMods) ELSE mod := LMod[-ref] END END ReadMod; Importing a whole module requires reading the module (by means of ReadMod) followed by reading all exported objects of this module (by means of ReadObjects). Together there are only a few additional rules to the general graph algorithm to be observed: 1. For types and modules , which both may be referenced several times within the same symbol file , a marking scheme and a translation table analogous to the general graph read/write algorithms is used. 2. For named types and modules , which both may occur in several symbol files , the primary instance always must be used and inserted into the local translation tables. If the primary instance already exists, the remaining data must be read but then discarded. Repeated occurence of the same module is detected using a global module table, repeated occurence of the same named type is detected using its module and the corresponding module scope. 3. As a consequence of the second rule, no other types must occur in the symbol file before name and module of a named type are specified. 2.4 Implementation Aspects Symbol File Representation As mentioned in the beginning, symbol files need not to be represented by a separate file. Actually, a symbol file independent of the corresponding object file mirrors the fact that in Modula-2 and other languages the definition and the implementation of a module were compiled separately: the definition module was compiled into a symbol file and the implementation module into an object file. If exported objects are marked in some way within the implementation module (as in Oberon), a definition module and hence a separate symbol file is not necessary. It is more adequate to produce only a single file during compilation, which contains all the necessary information about the module including its interface description. For example, the (conventional) symbol file may be appended to the object file. The advantages are obvious: only a single file has to be distributed, this file is always self-contained and therefore consistent and it is possible to directly generate the interface specification out of an object file with a Browser tool. Should it be necessary to inhibit public use of a module (e.g. because it supports low-level features), it is easy to remove the symbol file from the object file and distribute the interface-less object file only. Canonical Form for Symbol Files When a module has to be recompiled for some reason without a change in its interface, the old module key should be used again in the symbol file. Otherwise all depending modules would also be invalidated and would have to be recompiled. A simple method to check whether a module's interface has changed or not is to bytewise compare the new symbol file against the old one. If it has not changed, the old key can be reused. This comparison method requires very stable symbol files: e.g. the ordering of the exported objects should have no effect on the symbol file. This is clearly not fulfilled in the implementation described above (which we've chosen for simplicity), because objects are ordered in the linear list according to their occurence in the module. Desired is a kind of canonical form for symbol files. As we mentioned earlier, access to a special object in the symbol table should be as efficient as possible, so one could implement the table using binary trees instead of (unsorted) linear lists. Then, the objects could be written in alphabetical order to the symbol file using an in-order traversal of the tree. Symbol files in this canonical form are invariant to any changes in the ordering of a module's objects. Unfortunately this implies an undesired side effect which destroys the efficiency of binary trees: Importing a symbol file means inserting the imported objects in binary trees. Because the objects are alphabetically sorted, the symbol table tree degenerates to a linear list. Further, most of the objects in typical modules are known by import. In consequence, searching in the symbol table means actually searching in linear lists. Hence, the additional implementation effort for simple binary trees may be no longer justified. The situation may be improved by modifing the proposed canonical form. Instead of writing all objects in alphabetical order, groups of objects of the same kind (with the same mode) are written alphabetically: first all constants, then the variables, then types, and so on. Within each group the objects are ordered and the group ordering is also specified. This is another canonical form which is also invariant against any permutations of exported objects. When imported, the trees will not completely degenerate but are decomposed into several partial lists. The result is at least a better balanced tree. Note that the addional effort of traversing the tree in several passes (for each object group) is only necessary during export; the more time critical import procedure is not at all affected. Predefined Types In several languages there exist predefined types which are known to the compiler. Such types are always exported using a fixed reference number. Before any object is imported, they are inserted into the local structure table by the import procedure (see Appendix B). Increased Import Speed As measurements show (see below), when reading is bytewise, the import speed highly depends on the symbol file length. Besides strings, most of the data on a symbol file are integers. Hence, a principal goal should be to reduce the space used to write a single integer number. Integers occur frequently as tag or reference numbers and most of them are very small (with an absolute value less than 64). Nevertheless also bigger integers should be managed easily. The ReadInt and WriteInt procedures described in [Te90] allow integers to be written in a machine-independent format which uses only one byte per number in most cases. 2.5 Measurements The following measurements show lengths and reading times of symbol files. The basic modules of the Oberon System are choosen as a typical "module mix". The method described here (used in module CCT) is compared to the method used in the existing portable Oberon compilers (module OPT, see [Cre90]) which is essentially the same method as described in [Gu85]. To be fair, the CCT method was adjusted such that about the same information per object was written out as with OPT (e.g. the address of each parameter of a procedure) but using a compactifying WriteInt procedure. The measurements were made on a Ceres-2 computer [He88] with a NS32532 CPU running at 25 Mhz clock speed. In the left table the lengths of the symbol files in bytes are shown (OPT lengths are 100%). On average, the CCT symbol files are about 25% shorter than the corresponding OPT files. When the same symbol files are written using a non-compactifying WriteInt procedure in CCT (for addresses only), the files are about 8 % shorter (measurements not shown here). The right table shows the reading times in ms for each symbol file (OPT times are 100%). For this, each file was imported 100 times and only the pure file reading time without directory operations was measured and the average taken. The actual time spent in the import procedures is much higher because of directory accesses and may vary even for the same file. Tab. 2 Further analysis of the measurements shows what was presumed, namely a practically linear dependency between the file lengths and their reading times (Tab. 2); the linear regression coefficient r is nearly 1.0 for both methods. Although the average reading time per byte in CCT is larger than in OPT, the shorter file lengths made up for this difference. For comparison, the values for pure file reading are also shown (the reading time per byte is the average measured time for this file mix and is only valid for short files in general). In every case the file length completely dominates other factors, so to improve importing speed shorter symbol files have to be achieved. But nevertheless, such an effort is only justified if the file system offers some kind of caching strategy for files already open. Otherwise the reading time for such short files is negligible compared to the directory access time. The import and export procedures in CCT are about 20 % shorter in source code size than the OPT procedures (Tab. 2). 3. Summary As we have seen, the linearization algorithm for general graphs is very simple and easily adjusted to similar problems. In the example of symbol files it leads to a clean and understandable solution. The simplicity of the algorithm and the fact that only local invariants have to be considered, allows symbol files to be easily extended. The main difference between the algorithm described here and the one described in [Gu85] is that here a pre-order traversal instead of a post-order traversal is used. Using a post-order traversal, several postulates have to be guaranteed all the time, which complicate the algorithm. Nevertheless, cyclic references within types are a problem which has to be handled in a special way. Although the post-order algorithm requires no recursion for reading, the presented algorithm is faster in the average because symbol files are shorter and time used for reading in todays computers is determined mostly by the length of the files which are to be processed. Acknowledgements I would like to thank Josef Templ, Cuno Pfister, Clemens Szyperski and H. Mössenböck. Josef answered tricky questions about symbol files; he and Cuno worked as proof readers. H. Mössenböck added valuable comments to the modified algorithm in Appendix A. Not to forget, the entire paper was written using the excellent Write text editor of Clemens. References [Cre90] R. Crelier. OP2: A Portable Oberon Compiler . Computersysteme ETH Zürich, Technical Report No. 125, February 1990. [Gri90] R. Griesemer. Seneca - A Language for Numerical Computations on Vectorcomputers . Conpar 90 Proceedings, Volume on special technical contributions, Zürich, September 1990. [Gu85] J. Gutknecht. Compilation of Data Structures: A New Approach to Efficient Modula-2 Symbol Files . Computersysteme ETH Zürich, Technical Report No. 64, July 1985. [PfiHeTe91] C. Pfister, B. Heeb, J. Templ. Oberon Technical Notes . Companion paper. [ReMö86] P. Rechenberg, H. Mössenböck. An Algorithm for the Linear Storage of Dynamic Data Structures . Internal Paper, University of Linz, Austria, 1986. [Te90] J. Templ. SPARC-Oberon. User's Guide and Implementation. Computersysteme ETH Zürich, Technical Report No. 133, June 1990. [Wi85] N. Wirth. A Fast and Compact Compiler for Modula-2 . Computersysteme ETH Zürich, Technical Report No. 64, July 1985. [Wi88] N. Wirth. The Programming Language Oberon . Software - Practice and Experience, 18, 7, July 1988 and Computersysteme ETH Zürich, Technical Report No. 143, November 1990. Appendix A: A modified Linearization Algorithm When working with rather large graphs consisting of several thousands or even millions of nodes, it might be impractical to use a translation table NodeTab of about the same size as the graph for reading. If only multiple referenced nodes had to be stored in the NodeTab array, this translation table could be much smaller. This can be achieved by a split of the write phase into two subphases. As a desirable side effect, after writing, all preconditions for writing are established again, i.e. no unmarking is necessary. The modifications are described shortly: Writing : In the first pass all nodes are traversed and the mark field is used as reference counter (procedure Mark). For the mark field of every (reachable) node the following holds: In the second pass, every node that is referenced more than once (i.e. mark > 1) is written using a special tag and its negative node number is stored in its mark field (which now again fulfills the preconditions for writing). Reading: A negative node tag designates an already read node. A positive node tag specifies a node which occurs only once (tag is even) or a node which will be referenced again (tag is odd) and therefore must be stored in the NodeTab array. TYPE Node = POINTER TO NodeDesc; NodeDesc = RECORD mark: INTEGER; (* used for linearization; initially < 0 *) tag: INTEGER; (* determines node type; tag > 0 *) data: ARRAY M(tag) OF INTEGER; (* N depends on the node type (tag) *) link: ARRAY N(tag) OF Node (* M depends on the node type (tag) *) END; VAR NofNodes: INTEGER; (* node number; initially = 1 *) NodeTab: ARRAY MaxRefs OF Node; (* node table; contains each node referenced more than once *) PROCEDURE Mark(q: Node); (* Pass 1 *) (* precondition: q: q.mark < 0 *) VAR i: INTEGER; BEGIN IF q # NIL THEN IF q.mark < 0 THEN (* first occurence *) q.mark := 1; i := 0; WHILE i < Ntag DO Mark(q.link[i]); INC(i) END ELSE INC(q.mark) END END END Mark; (* postcondition: q: q is marked *) PROCEDURE WriteNode(q: Node); (* Pass 2 *) (* precondition: (q: q is marked) (NofNodes = 1) *) VAR i: INTEGER; BEGIN IF q = NIL THEN WriteInt(0) ELSIF q.mark < 0 THEN (* already marked *) WriteInt(q.mark) ELSE (* first occurence *) IF q.mark = 1 THEN (* node occurs only once *) WriteInt(q.tag*2); q.mark := -1 ELSE (* node is referenced several times *) WriteInt(q.tag*2 + 1); q.mark := -NofNode; INC(NofNodes) END; i := 0; WHILE i < Mtag DO WriteInt(q.data[i]); INC(i) END; i := 0; WHILE i < Ntag DO WriteNode(q.link[i]); INC(i) END END END WriteNode; (* postcondition: q: q.mark < 0 *) PROCEDURE ReadNode(VAR q: Node); (* precondition: (NodeTab[0] = NIL) (NofNodes = 1) *) VAR tag, i: INTEGER; BEGIN ReadInt(tag); IF tag <= 0 THEN (* node reference *) q := NodeTab[-tag] ELSE (* first occurence *) NEW(q, tag DIV 2); IF ODD(tag) THEN (* node is referenced several times *) NodeTab[NofNodes] := q; INC(NofNodes) END; i := 0; WHILE i < Mtag DO ReadInt(q.data[i]); INC(i) END; i := 0; WHILE i < Ntag DO ReadNode(q.link[i]); INC(i) END END END ReadNode; Appendix B: Import / Export in Detail In the following an extract of a table handler with the complete import / export procedures is shown. A few points are specific to this implementation and hence commented accordingly. CONST (* implementation restrictions *) MaxNofGMods = 32; MaxNofLMods = 24; MaxNofStructs = 256; (* object modes *) Undef* = 0; Scope = 1; Const* = 2; Type* = 3; Var* = 4; VarPar* = 5; XVar* = 6; IVar* = 7; Field* = 8; LProc* = 9; XProc* = 10; IProc* = 11; SProc* = 12; Mod* = 13; (* type forms *) Char* = 2; Bool* = 3; SInt* = 4; Int* = 5; LInt* = 6; Set* = 7; Real* = 8; LReal* = 9; Cmplx* = 10; LCmplx* = 11; NilTyp* = 13; NoTyp* = 14; Proc* = 15; String* = 16; Array* = 17; DynArr* = 18; Record* = 19; Pointer* = 20; TYPE Name* = ARRAY 32 OF CHAR; Object* = POINTER TO ObjectDesc; Struct* = POINTER TO StructDesc; ObjectDesc* = RECORD link*, next*: Object; (* objects are chained using next *) name* : Name; typ* : Struct; marked* : BOOLEAN; (* marked objects are exported *) leave* : BOOLEAN; mode* : INTEGER; (* identifies object kind *) mnolev* : INTEGER; (* module numbers are <= 0 *) a0*, a1* : LONGINT; (* object specific data *) b0*, b1* : LONGINT (* object specific data *) END; StructDesc* = RECORD ref : INTEGER; (* used as mark field *) form* : INTEGER; (* identifies structure kind *) obj* : Object; (* points to type object if it exists *) len*, size*: LONGINT; base* : Struct; (* result-, element-, base- or pointee type *) link* : Object (* record scope *) END; VAR system, universe: Object; (* predefined scopes *) topScope*, undefObj*: Object; (* current topScope, error object *) firstStructRef: INTEGER; nofGMods*: INTEGER; GMod*: ARRAY MaxNofGMods OF Object; (* global module table *) (* predefined types *) undefTyp*, noTyp*, stringTyp*, boolTyp*, charTyp*, sIntTyp*, intTyp*, lIntTyp*, setTyp*, realTyp*, lRealTyp*, cmplxTyp*, lCmplxTyp*: Struct; PROCEDURE err(no: INTEGER); (* Displays an error message *) (* general table handling *) PROCEDURE NewObject*(VAR obj: Object; VAR name: ARRAY OF CHAR; mode: INTEGER); (* Creates a new object and initializes its fields *) PROCEDURE NewStruct*(VAR str: Struct; form: INTEGER); (* Creates a new structure and initializes its fields *) PROCEDURE OpenScope*(VAR scope: Object; mnolev: INTEGER); (* Opens a new scope if scope is NIL; otherwise the old scope is reopened *) PROCEDURE CloseScope*; (* Closes topScope *) PROCEDURE Insert*(VAR obj: Object; name: ARRAY OF CHAR; mode: INTEGER); VAR p, q: Object; BEGIN p := topScope; q := topScope.next; WHILE q # NIL DO IF q.name = name THEN err(1) END; p := q; q := q.next END; NewObject(obj, name, mode); obj.mnolev := topScope.mnolev; p.next := obj END Insert; (* import table handling *) PROCEDURE InsertMod*(VAR mod: Object; VAR name: ARRAY OF CHAR; key: LONGINT); VAR i: INTEGER; BEGIN i := 0; WHILE (i < nofGMods) & (name # GMod[i].name) DO INC(i) END; IF i < nofGMods THEN (* module already imported *) mod := GMod[i]; IF mod.b0 # key THEN err(150) (* key inconsistency *) END ELSE NewObject(mod, name, Mod); (* must not be visible in global scope *) mod.b0 := key; mod.mnolev := -nofGMods; OpenScope(mod.link, mod.mnolev); CloseScope; (* allocate own scope *) IF nofGMods < MaxNofGMods THEN GMod[nofGMods] := mod; INC(nofGMods) ELSE err(227) (* to many imported modules *) END END END InsertMod; PROCEDURE InsertImport(VAR obj: Object; scope: Object); VAR p, q: Object; BEGIN p := scope; q := scope.next; WHILE q # NIL DO IF q.name = obj.name THEN obj := q; RETURN END; p := q; q := q.next END; obj.mnolev := scope.mnolev; p.next := obj END InsertImport; (* import *) PROCEDURE OpenRider(VAR R: Files.Rider; name: ARRAY OF CHAR; VAR res: INTEGER); (* Sets a rider to the beginning of the symbol file *) PROCEDURE Import*(VAR substName, impName, selfName: ARRAY OF CHAR); VAR R: Files.Rider; mod0, mod: Object; res, nofLMods, nofStructs: INTEGER; LMod: ARRAY MaxNofLMods OF Object; LStruct: ARRAY MaxNofStructs OF Struct; PROCEDURE^ ReadStruct(VAR typ: Struct); PROCEDURE ReadInt(VAR i: LONGINT); (* Reads integers written in a compacted form [Te90] *) VAR n: LONGINT; s: INTEGER; x: CHAR; BEGIN s := 0; n := 0; Files.Read(R, x); WHILE ORD(x) >= 128 DO INC(n, ASH(ORD(x) - 128, s)); INC(s, 7); Files.Read(R, x) END; i := n + ASH(ORD(x) MOD 64 - ORD(x) DIV 64 * 64, s) END ReadInt; PROCEDURE ReadName(VAR name: ARRAY OF CHAR); (* Reads a name terminated with 0X *) PROCEDURE ReadString(VAR pos, len: LONGINT); (* Reads a string terminated with 0X into the scanner string buffer *) PROCEDURE ReadKey(VAR x: LONGINT); (* Reads an integer in uncompacted form *) PROCEDURE ReadMod(VAR mod: Object); VAR ref, key: LONGINT; name: Name; BEGIN ReadInt(ref); IF ref > 0 THEN (* first occurence *) ReadKey(key); ReadName(name); IF name = selfName THEN err(49) END; InsertMod(mod, name, key); IF nofLMods < MaxNofLMods THEN LMod[nofLMods] := mod; INC(nofLMods) ELSE err(227) (* to many imported modules *) END ELSE mod := LMod[-ref] END END ReadMod; PROCEDURE ReadFields(VAR field: Object); VAR obj, last: Object; name: Name; BEGIN field := NIL; LOOP ReadName(name); IF name = "" THEN EXIT END; NewObject(obj, name, Field); ReadStruct(obj.typ); ReadInt(obj.a0); IF field = NIL THEN field := obj ELSE last.next := obj END; last := obj END END ReadFields; PROCEDURE ReadFP(VAR par: Object; VAR nofArgs: LONGINT); VAR obj, last: Object; typ: Struct; n, a0, a1: LONGINT; name: Name; BEGIN par := NIL; ReadInt(nofArgs); n := nofArgs; WHILE n > 0 DO ReadName(name); ReadStruct(typ); ReadInt(a0); ReadInt(a1); IF ODD(a1) THEN NewObject(obj, name, VarPar) ELSE NewObject(obj, name, Var) END; obj.typ := typ; obj.a0 := a0; obj.a1 := a1 DIV 2; IF par = NIL THEN par := obj ELSE last.next := obj END; last := obj; DEC(n) END END ReadFP; PROCEDURE ReadStruct(VAR typ: Struct); VAR form: LONGINT; htyp: Struct; name: Name; obj, mod: Object; BEGIN ReadInt(form); IF form <= 0 THEN (* struct reference or NIL *) typ := LStruct[-form] ELSE (* first occurence *) NewStruct(htyp, SHORT(form)); ReadName(name); IF name # "" THEN (* named type *) NewObject(obj, name, Type); obj.marked := TRUE; obj.typ := htyp; htyp.obj := obj; ReadMod(mod); InsertImport(obj, mod.link); typ := obj.typ ELSE typ := htyp END; IF nofStructs < MaxNofStructs THEN LStruct[nofStructs] := typ; INC(nofStructs) ELSE err(228) (* to many imported types *) END; ReadStruct(htyp.base); CASE form OF | Proc: OpenScope(htyp.link, 0); ReadFP(htyp.link.next, htyp.len); CloseScope; htyp.size := 1 | Array: ReadInt(htyp.len); ReadInt(htyp.size) | DynArr: ReadInt(htyp.size) | Record: OpenScope(htyp.link, 0); ReadFields(htyp.link.next); CloseScope; ReadInt(htyp.size); IF htyp.base # NIL THEN htyp.len := htyp.base.len+1; htyp.link.link := htyp.base.link ELSE htyp.len := 0 END | Pointer: typ.size := 1 END END END ReadStruct; PROCEDURE ReadObjects(scope: Object); VAR mode: LONGINT; obj: Object; typ: Struct; name: Name; BEGIN LOOP (* read all objects *) ReadInt(mode); IF mode = Undef THEN EXIT ELSIF mode = Type THEN ReadName(name); IF name # "" THEN (* alias type *) NewObject(obj, name, Type); ReadStruct(obj.typ); InsertImport(obj, scope) ELSE (* other types *) ReadStruct(typ) END ELSE ReadName(name); NewObject(obj, name, SHORT(mode)); ReadStruct(obj.typ); IF mode = Const THEN IF obj.typ.form = String THEN ReadString(obj.b0, obj.b1) ELSIF obj.typ.form = LReal THEN ReadInt(obj.b0); ReadInt(obj.b1) ELSE ReadInt(obj.b0) END ELSIF obj.mode = Var THEN ReadInt(obj.a0); obj.mode := XVar ELSIF obj.mode = LProc THEN OpenScope(obj.link, 0); ReadFP(obj.link.next, obj.b0); CloseScope; obj.mode := XProc ELSIF obj.mode = IProc THEN OpenScope(obj.link, 0); ReadFP(obj.link.next, obj.b0); CloseScope END; InsertImport(obj, scope) END END END ReadObjects; PROCEDURE EnterStruct(typ: Struct); (* Enters a predefined type into the LStruct table *) BEGIN IF impName = "SYSTEM" THEN Insert(mod, impName, Mod); mod.link := system ELSE OpenRider(R, impName, res); IF res < 0 THEN (* no error occured *) LStruct[0] := NIL; EnterStruct(stringTyp); EnterStruct(undefTyp); EnterStruct(noTyp); EnterStruct(boolTyp); EnterStruct(charTyp); EnterStruct(sIntTyp); EnterStruct(intTyp); EnterStruct(lIntTyp); EnterStruct(setTyp); EnterStruct(realTyp); EnterStruct(lRealTyp); EnterStruct(cmplxTyp); EnterStruct(lCmplxTyp); nofLMods := 0; nofStructs := firstStructRef; ReadMod(mod0); ReadObjects(mod0.link); Insert(mod, substName, Mod); mod.b0 := mod0.b0; mod.mnolev := mod0.mnolev; mod.link := mod0.link ELSE err(res) END END END Import; (* export *) PROCEDURE Export* (mod: Object; VAR buf: ARRAY OF CHAR; VAR len: LONGINT; VAR new: BOOLEAN); VAR pos: LONGINT; nofLMods, nofStructs: INTEGER; PROCEDURE^ WriteStruct(typ: Struct); PROCEDURE Write(ch: CHAR); BEGIN buf[pos] := ch; INC(pos) END Write; PROCEDURE WriteInt(i: LONGINT); (* Writes integers in a compacted form [Te90] *) BEGIN WHILE (i < -64) OR (i > 63) DO Write(CHR(i MOD 128 + 128)); i := i DIV 128 END; Write(CHR(i MOD 128)) END WriteInt; PROCEDURE WriteName(VAR name: ARRAY OF CHAR); (* Writes a name terminated with 0X *) PROCEDURE WriteString(pos: LONGINT); (* Writes the string terminated with 0X at position pos of the scanner string buffer *) PROCEDURE WriteKey(x: LONGINT); (* Writes an integer in uncompacted form *) PROCEDURE WriteMod(mod: Object); BEGIN IF mod.b1 < 0 THEN (* first occurence *) mod.b1 := nofLMods; INC(nofLMods); WriteInt(Mod); WriteKey(mod.b0); WriteName(mod.name) ELSE WriteInt(-mod.b1) END END WriteMod; PROCEDURE WriteFields(field: Object); BEGIN WHILE field # NIL DO IF field.marked THEN WriteName(field.name); WriteStruct(field.typ); WriteInt(field.a0) END; field := field.next END; Write(0X) END WriteFields; PROCEDURE WriteFP(par: Object; nofArgs: LONGINT); BEGIN WriteInt(nofArgs); WHILE nofArgs > 0 DO WriteName(par.name); WriteStruct(par.typ); WriteInt(par.a0); IF par.mode = VarPar THEN WriteInt(par.a1*2 + 1) ELSE WriteInt(par.a1*2) END; par := par.next; DEC(nofArgs) END END WriteFP; PROCEDURE WriteStruct(typ: Struct); VAR name: Name; BEGIN IF typ = NIL THEN WriteInt(0) ELSIF typ.ref > 0 THEN (* already marked *) WriteInt(-typ.ref) ELSE (* first occurence *) WriteInt(typ.form); typ.ref := nofStructs; INC(nofStructs); IF typ.obj # NIL THEN (* named type *) name := typ.obj.name; IF ~typ.obj.marked THEN (* invisible type *) name[0] := CHR(ORD(name[0]) - ORD("@")) END; WriteName(name); WriteMod(GMod[-typ.obj.mnolev]) ELSE Write(0X) END; WriteStruct(typ.base); CASE typ.form OF | Proc: WriteFP(typ.link.next, typ.len) | Array: WriteInt(typ.len); WriteInt(typ.size) | DynArr: WriteInt(typ.size) | Record: WriteFields(typ.link.next); WriteInt(typ.size) | Pointer: END END END WriteStruct; PROCEDURE WriteObjects(obj: Object); BEGIN WHILE obj # NIL DO IF obj.marked THEN WriteInt(obj.mode); IF (obj.mode # Type) OR (obj.typ.obj # obj) THEN (* no-type or alias type *) WriteName(obj.name) ELSE (* other type *) Write(0X) END; WriteStruct(obj.typ); IF obj.mode = Const THEN IF obj.typ.form = String THEN WriteString(obj.b0) ELSIF obj.typ.form = LReal THEN WriteInt(obj.b0); WriteInt(obj.b1) ELSE WriteInt(obj.b0) END ELSIF obj.mode = Var THEN WriteInt(obj.a0) ELSIF obj.mode IN {LProc, IProc} THEN WriteFP(obj.link.next, obj.b0) END END; obj := obj.next END; WriteInt(Undef) (* termination tag *) END WriteObjects; PROCEDURE Compare(VAR buf: ARRAY OF CHAR; len: LONGINT; VAR new: BOOLEAN); VAR res: INTEGER; i: LONGINT; R: Files.Rider; ch: CHAR; prefix: ARRAY 5 OF CHAR; BEGIN OpenRider(R, mod.name, res); new := TRUE; IF res < 0 THEN Files.ReadBytes(R, prefix, LEN(prefix)); Files.Read(R, ch); i := LEN(prefix); WHILE (i < len) & (buf[i] = ch) DO Files.Read(R, ch); INC(i) END; IF i = len THEN (* same symbol table => use old key *) i := 0; WHILE i < LEN(prefix) DO buf[i] := prefix[i]; INC(i) END; new := FALSE END END END Compare; BEGIN pos := 0; nofLMods := 0; nofStructs := firstStructRef; WriteMod(mod); WriteObjects(mod.link.next); Compare(buf, pos, new) END Export;